Inferring Psycholinguistic Properties of Words

Authors

  • Gustavo Paetzold
  • Lucia Specia
Abstract

We introduce a bootstrapping algorithm for regression that exploits word embedding models. We use it to infer four psycholinguistic properties of words: Familiarity, Age of Acquisition, Concreteness and Imagery, and to further populate the MRC Psycholinguistic Database with these properties. The approach achieves 0.88 correlation with human-produced values, and the inferred psycholinguistic features lead to state-of-the-art results when used in a Lexical Simplification task.
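The abstract does not spell out the algorithm, so the following is only a minimal sketch of one generic way to bootstrap a regressor over word embeddings, not the authors' method. It assumes a self-training loop: a similarity-weighted nearest-neighbour regressor predicts a score for each unlabeled word, the most confident prediction (here, the one with the highest cosine similarity to an already labeled word) is added to the training set, and the loop repeats. The toy 2-D embeddings, the seed scores, and the helper names `cosine`, `knn_predict` and `bootstrap` are all hypothetical.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def knn_predict(vec, labeled, embeddings, k=2):
    """Return (prediction, confidence) for one word vector.

    The prediction is the similarity-weighted mean of the k most similar
    labeled words; the confidence is the similarity of the closest one.
    Using similarity as confidence is an assumption of this sketch.
    """
    sims = sorted(
        ((cosine(vec, embeddings[w]), y) for w, y in labeled.items()),
        reverse=True,
    )[:k]
    total = sum(max(s, 0.0) for s, _ in sims)
    if total == 0:
        return sum(y for _, y in sims) / len(sims), sims[0][0]
    return sum(max(s, 0.0) * y for s, y in sims) / total, sims[0][0]


def bootstrap(embeddings, seed_labels, batch=1):
    """Self-training loop: label the most confident words first."""
    labeled = dict(seed_labels)
    unlabeled = [w for w in embeddings if w not in labeled]
    while unlabeled:
        scored = [
            (knn_predict(embeddings[w], labeled, embeddings), w)
            for w in unlabeled
        ]
        # Highest confidence first; ties keep their original order.
        scored.sort(key=lambda item: item[0][1], reverse=True)
        for (pred, _conf), w in scored[:batch]:
            labeled[w] = pred
            unlabeled.remove(w)
    return labeled


# Hypothetical toy data: 2-D "embeddings" and two seed scores on a
# concreteness-like scale. Real embeddings and ratings would replace these.
embeddings = {
    "dog": [1.0, 0.1],
    "puppy": [0.9, 0.2],
    "idea": [0.1, 1.0],
    "notion": [0.2, 0.9],
    "hound": [0.8, 0.3],
}
seed = {"dog": 600.0, "idea": 300.0}

result = bootstrap(embeddings, seed)
for word, score in result.items():
    print(f"{word}: {score:.1f}")
```

In this toy run, words close to "dog" end up with higher scores than words close to "idea". A realistic evaluation would hold out some human ratings and report the correlation between predicted and held-out values, which is the kind of figure the abstract quotes.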


Similar resources

A Lightweight Regression Method to Infer Psycholinguistic Properties for Brazilian Portuguese

Psycholinguistic properties of words have been used in various approaches to Natural Language Processing tasks, such as text simplification and readability assessment. Most of these properties are subjective, involving costly and time-consuming surveys to be gathered. Recent approaches use the limited datasets of psycholinguistic properties to extend them automatically to large lexicons. Howeve...


Language-Independent Prediction of Psycholinguistic Properties of Words

The psycholinguistic properties of words, namely, word familiarity, age of acquisition, concreteness, and imagery, have been reported to be effective for educational natural language processing tasks. Previous studies on predicting the values of these properties rely on language-dependent features. This paper is the first to propose a practical language-independent method for predicting such valu...


Collecting and Exploring Everyday Language for Predicting Psycholinguistic Properties of Words

Exploring language usage through frequency analysis in large corpora is a defining feature in most recent work in corpus and computational linguistics. From a psycholinguistic perspective, however, the corpora used in these contributions are often not representative of language usage: they are either domain-specific, limited in size, or extracted from unreliable sources. In an effort to address...


Psycholinguistic Ambiance of Short Stories in Enhancing Students’ Reading Comprehension and Vocabulary Power

The present study was carried out to investigate the effect of short stories on students’ reading comprehension, vocabulary power and attitude towards the skill and the new instructional materials. The participants of the study were 120 grade 9 students of Dilla Secondary and Preparatory School. In order to gather data for the study, pre- and posttest of reading comprehension, pre and ...


Planning and production of grammatical and lexical verbs in multi-word messages

Grammatical words represent the part of grammar that can be most directly contrasted with the lexicon. Aphasiological studies, linguistic theories and psycholinguistic studies suggest that their processing is operated at different stages in speech production. Models of sentence production propose that at the formulation stage, lexical words are processed at the functional level while grammatica...



Publication date: 2016